scikit-learn: RepeatedStratifiedKFold
Randomized CV splitters may return different results for each call of split. You can make the results identical by setting random_state to an integer.
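"A randomized CV splitter may return a different result on every call to split."
"Setting the random_state argument to an integer makes the results exactly identical."
My understanding: the split differs from one repetition to the next, but it can be reproduced (see also the example that follows).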
code:example.py
>>> import numpy as np
>>> from sklearn.model_selection import RepeatedStratifiedKFold
>>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([0, 0, 1, 1])
>>> rskf = RepeatedStratifiedKFold(n_splits=2, n_repeats=2, random_state=36851234)
>>> for train_index, test_index in rskf.split(X, y):
...     print("TRAIN:", train_index, "TEST:", test_index)
...
TRAIN: [1 2] TEST: [0 3]  # 0
TRAIN: [0 3] TEST: [1 2]  # 0
TRAIN: [1 3] TEST: [0 2]  # 1 (a different split from repetition 0)
TRAIN: [0 2] TEST: [1 3]  # 1
>>> for train_index, test_index in rskf.split(X, y):
...     print("TRAIN:", train_index, "TEST:", test_index)
...
TRAIN: [1 2] TEST: [0 3]  # reproduced thanks to random_state
TRAIN: [0 3] TEST: [1 2]
TRAIN: [1 3] TEST: [0 2]
TRAIN: [0 2] TEST: [1 3]
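
To check the reproducibility claim without comparing printed output by eye, here is a minimal sketch. The helper `collect_splits` and the 20-sample toy data are assumptions made for illustration and are not part of the scikit-learn example. It compares two calls of split on a seeded splitter, compares repetition 0 with repetition 1 inside one call, and repeats the call-to-call comparison with random_state=None.

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold


def collect_splits(splitter, X, y):
    """Hypothetical helper: gather one split() call as hashable index tuples."""
    return [(tuple(train), tuple(test)) for train, test in splitter.split(X, y)]


# Toy data (assumed for illustration): 20 samples, two balanced classes.
X = np.arange(40).reshape(20, 2)
y = np.array([0] * 10 + [1] * 10)

# Integer random_state: each call of split() yields the same sequence of splits.
seeded = RepeatedStratifiedKFold(n_splits=2, n_repeats=2, random_state=0)
first_call = collect_splits(seeded, X, y)
second_call = collect_splits(seeded, X, y)
print("same across calls (int seed):", first_call == second_call)

# Within one call, folds 0-1 belong to repetition 0 and folds 2-3 to
# repetition 1; the two repetitions are shuffled separately.
print("repetition 0 == repetition 1:", first_call[:2] == first_call[2:])

# random_state=None: the splits may differ from one call to the next.
unseeded = RepeatedStratifiedKFold(n_splits=2, n_repeats=2, random_state=None)
print("same across calls (None):",
      collect_splits(unseeded, X, y) == collect_splits(unseeded, X, y))
```

The first comparison should print True, since that is what the quoted documentation promises for an integer random_state. The other two depend on the random draws, so no particular output is claimed for them.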